Adaptive Submodular Maximization in Bandit Setting
Authors
Abstract
Maximization of submodular functions has wide applications in machine learning and artificial intelligence. Adaptive submodular maximization has traditionally been studied under the assumption that the model of the world (the expected gain of choosing an item given previously selected items and their states) is known. In this paper, we study the setting where the expected gain is initially unknown and is learned by interacting repeatedly with the optimized function. We propose an efficient algorithm for solving our problem and prove that its expected cumulative regret increases logarithmically with time. Our regret bound captures an inherent property of submodular maximization: earlier mistakes are more costly than later ones. We refer to our approach as Optimistic Adaptive Submodular Maximization (OASM) because it trades off exploration and exploitation based on the principle of optimism in the face of uncertainty. We evaluate our method on a preference elicitation problem and show that non-trivial K-step policies can be learned from just a few hundred interactions with the problem.
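The optimistic greedy idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm as specified; it rests on assumptions that the abstract does not state: each item has an independent Bernoulli state with unknown success probability, an item that succeeds covers a known set of elements, and the confidence bonus has the standard UCB1 form. All names (`ucb`, `optimistic_greedy_episode`, `run`, `covers`, `true_p`) are hypothetical.

```python
import math
import random


def ucb(successes, pulls, t):
    """Optimistic estimate of a Bernoulli mean (assumed UCB1-style bonus).

    Items that were never observed get the most optimistic value, 1.0.
    """
    if pulls == 0:
        return 1.0
    bonus = math.sqrt(2.0 * math.log(max(t, 2)) / pulls)
    return min(1.0, successes / pulls + bonus)


def optimistic_greedy_episode(covers, true_p, stats, k, t, rng):
    """Run one K-step episode and return the number of covered elements.

    covers[i] : set of elements item i covers if its state is 1 (assumed known)
    true_p[i] : hidden success probability of item i (used only to simulate)
    stats[i]  : [successes, pulls] for item i, updated in place
    """
    covered, chosen = set(), set()
    for _ in range(k):
        # Greedy step on the optimistic expected marginal gain.
        best = max(
            (i for i in range(len(covers)) if i not in chosen),
            key=lambda i: ucb(stats[i][0], stats[i][1], t)
            * len(covers[i] - covered),
        )
        chosen.add(best)
        # Observe the state of the chosen item and update its statistics.
        state = 1 if rng.random() < true_p[best] else 0
        stats[best][0] += state
        stats[best][1] += 1
        if state:
            covered |= covers[best]
    return len(covered)


def run(covers, true_p, k, n_episodes, seed=0):
    """Repeat episodes, sharing statistics so estimates improve over time."""
    rng = random.Random(seed)
    stats = [[0, 0] for _ in covers]
    return [
        optimistic_greedy_episode(covers, true_p, stats, k, t, rng)
        for t in range(1, n_episodes + 1)
    ]
```

Calling `run(covers, true_p, k=2, n_episodes=300)` on a small instance returns the per-episode coverage. Because early picks in an episode shape what later picks can still gain, a wrong first choice tends to cost more than a wrong last choice, which is the intuition the abstract attributes to the regret bound.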
Similar resources
Diffusion Independent Semi-Bandit Influence Maximization
We consider influence maximization (IM) in social networks, which is the problem of maximizing the number of users that become aware of a product by selecting a set of “seed” users to expose the product to. While prior work assumes a known model of information diffusion, we propose a parametrization in terms of pairwise reachability which makes our framework agnostic to the underlying diffusion...
Online Submodular Set Cover, Ranking, and Repeated Active Learning
We propose an online prediction version of submodular set cover with connections to ranking and repeated active learning. In each round, the learning algorithm chooses a sequence of items. The algorithm then receives a monotone submodular function and suffers loss equal to the cover time of the function: the number of items needed, when items are selected in order of the chosen sequence, to ach...
Deterministic & Adaptive Non-Submodular Maximization via the Primal Curvature
While greedy algorithms have long been observed to perform well on a wide variety of problems, up to now approximation ratios have only been known for their application to problems having submodular objective functions f. Since many practical problems have non-submodular f, there is a critical need to devise new techniques to bound the performance of greedy algorithms in the case of non-submo...
Stochastic Submodular Maximization
We study the stochastic submodular maximization problem with respect to a cardinality constraint. Our model can capture the effect of uncertainty in different problems, such as cascade effects in social networks, capital budgeting, sensor placement, etc. We study non-adaptive and adaptive policies and give optimal constant approximation algorithms for both cases. We also bound the adaptivity gap of...
Non-Monotone Adaptive Submodular Maximization
A wide range of AI problems, such as sensor placement, active learning, and network influence maximization, require sequentially selecting elements from a large set with the goal of optimizing the utility of the selected subset. Moreover, each element that is picked may provide stochastic feedback, which can be used to make smarter decisions about future selections. Finding efficient policies f...